Skip to content

Serialize and restore compiled schema#115

Open
Ahmedmhmud wants to merge 7 commits intohyperjump-io:mainfrom
Ahmedmhmud:serialize-compiled-schema
Open

Serialize and restore compiled schema#115
Ahmedmhmud wants to merge 7 commits intohyperjump-io:mainfrom
Ahmedmhmud:serialize-compiled-schema

Conversation

@Ahmedmhmud
Copy link
Copy Markdown

This PR adds experimental helpers to serialize and deserialize compiledSchema so applications can persist and restore compiled validation logic more easily. It handles the non-JSON parts of a compiled schema by preserving RegExp values and restoring EvaluationPlugins by ID, first from built-in plugins and then from an optional pluginsById map for custom plugins, with a clear error when a plugin cannot be found. It also wires the helpers into the experimental API surface and includes focused tests and error cases.

Copy link
Copy Markdown
Collaborator

@jdesrosiers jdesrosiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using replacer/reviver for plugins isn't the best approach. The pattern matching approach used by isPlugin feels clunky. Since those plugins are always in a known location, I think it makes more sense to convert them before serialization.

Converting RegEpx is different because they could be anywhere and determining if something is a RegExp is trivial.

Plugins isn't actually a Set, it's an array. That was a error in the type that has already been corrected. Rebase your branch to get the correct type.

The exported functions should be a the top of the file with helper functions below.

Using __type isn't going to work. If someone uses that as a property name in their schema, this approach could break. You'll need to pick a key that won't collide with anything that could end up in a schema. I suggest a UUID. That way it's effectively impossible for a user to accidentally create a schema with a conflict.

@Ahmedmhmud
Copy link
Copy Markdown
Author

Converted plugin objects in plugins to UUID-marked references before serialization to avoid shape-matching and property collisions. RegExp values are serialized and revived using a UUID marker. Deserializer accepts both the new UUID markers and legacy __type markers for compatibility. Updated also tests in compiled-schema-serialization.spec.ts.

Comment thread lib/compiled-schema-serialization.js Outdated
const REGEXP_MARKER = "a6d8f3e1-9b2c-4f7a-8d5b-1c2e3f4a5b6c";

export const serialize = (compiledSchema) => {
const ast = compiledSchema?.ast ? { ...compiledSchema.ast } : undefined;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ast isn't optional on compiledSchema.

Comment thread lib/compiled-schema-serialization.js Outdated
Comment on lines +9 to +10
if (ast && (Array.isArray(ast.plugins) || ast.plugins instanceof Set)) {
ast.plugins = [...ast.plugins].map((plugin) => {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ast.plugins will never be a Set.

Comment thread lib/compiled-schema-serialization.js Outdated
if (!plugin?.id) {
throw Error("Cannot serialize plugin without id");
}
return { [PLUGIN_MARKER]: true, id: plugin.id };
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PLUGIN_MARKER isn't necessary because we're not using replacer/reviver for this part. You should be able to just replace the plugin with its id.

Comment thread lib/compiled-schema-serialization.js Outdated
Comment on lines +39 to +42
// Legacy support: old tests/serializations used a __type marker.
if (value.__type === "RegExp") {
return new RegExp(value.source, value.flags);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__type was introduced in this PR. No one is depending on it. This looks like a mistake an AI coding tool would make. If you use those tools make sure you check everything they do and make sure it makes sense.

Comment thread lib/compiled-schema-serialization.js Outdated
Comment on lines +44 to +46
if (value[PLUGIN_MARKER]) {
return resolvePlugin(value.id, options);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't restore plugins in the reviver. Do it as a separate step like in serialize.

Comment thread lib/compiled-schema-serialization.js Outdated
Comment on lines +48 to +55
if (value.__type === "Plugin") {
return resolvePlugin(value.id, options);
}

if (value.__type === "PluginSet") {
// Legacy PluginSet: restore to an array of plugin objects
return (value.values || []).map((p) => resolvePlugin(p.id, options));
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary

@Ahmedmhmud
Copy link
Copy Markdown
Author

Thanks for heads up, I will update it.

@Ahmedmhmud
Copy link
Copy Markdown
Author

Ahmedmhmud commented May 6, 2026

Hi, @jdesrosiers
I have a question about plugins
You said earlier that plugins aren't set, they are an array
But compile() still produces plugins as Set() in lib/core.js and also in lib/validation.js treats plugins as a set, so what do you suggest?
Do I check if it is as a set like I was doing?
or I can open a new PR to fix it and all the code that depends on plugins as a set or creating plugins as a set?

BTW, I finished the work on this PR by treating plugins as an array, I can make tests pass by just checking ast.plugins instanceof Set and change it with the rest of conversion process in the other PR

@jdesrosiers
Copy link
Copy Markdown
Collaborator

Sorry! I was confusing the plugins in AST with plugins in ValidationContext. You're right, it is a Set in the AST. It's a Set because those are keyword-specific plugins that only get added if the keyword is present in the schema and they should only get added once if the keyword is used multiple times.

So, you should expect a Set, serialize to an array, and restore the array as a Set. An please also fix the type declaration.

@Ahmedmhmud
Copy link
Copy Markdown
Author

No problem at all, I fixed the old part and also the type declaration.

Comment on lines +7 to +8
const ast = compiledSchema.ast ? { ...compiledSchema.ast } : undefined;
if (ast && ast.plugins instanceof Set) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ast is always present and is always a Set. These checks are unnecessary.

export const serialize = (compiledSchema) => {
const ast = compiledSchema.ast ? { ...compiledSchema.ast } : undefined;
if (ast && ast.plugins instanceof Set) {
ast.plugins = [...ast.plugins].map((plugin) => {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern isn't great because it creates an intermediate array that isn't necessary. You can use @hyperjump/pact's map function that works on any iterable instead of just on functions (Pact.map((plugin) => ..., ast.plugins). Or, just use a normal for loop to build the array.

const ast = compiledSchema.ast ? { ...compiledSchema.ast } : undefined;
if (ast && ast.plugins instanceof Set) {
ast.plugins = [...ast.plugins].map((plugin) => {
if (!plugin?.id) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plugin should never be undefined. The ? is unnecessary.

if (!plugin?.id) {
throw Error("Cannot serialize plugin without id");
}
return { id: plugin.id };
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for this to be an object. return plugin.id should be sufficient.


return JSON.stringify(toSerialize, (_key, value) => {
if (value instanceof RegExp) {
return { [REGEXP_MARKER]: true, source: value.source, flags: value.flags };
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we slightly change the way this is represented, we can simplify the deserialize reviver function.

Suggested change
return { [REGEXP_MARKER]: true, source: value.source, flags: value.flags };
return { [REGEXP_MARKER]: { source: value.source, flags: value.flags } };

Comment on lines +30 to +36
if (!value || typeof value !== "object") {
return value;
}

if (value[REGEXP_MARKER]) {
return new RegExp(value.source, value.flags);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the previous suggested change, this can simply be ...

Suggested change
if (!value || typeof value !== "object") {
return value;
}
if (value[REGEXP_MARKER]) {
return new RegExp(value.source, value.flags);
}
if (key === REGEXP_MARKER) {
return new RegExp(value.source, value.flags);
}

return value;
});

if (Array.isArray(parsed?.ast?.plugins)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary. plugins will always be present and always be an array.

});

if (Array.isArray(parsed?.ast?.plugins)) {
parsed.ast.plugins = new Set(parsed.ast.plugins.map((plugin) => resolvePlugin(plugin.id, options)));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid the intermediate array with @hyperjump/pact.

Suggested change
parsed.ast.plugins = new Set(parsed.ast.plugins.map((plugin) => resolvePlugin(plugin.id, options)));
parsed.ast.plugins = Pact.pipe(
parsed.ast.plugins,
Pact.map((pluginUri) => resolvePlugin(pluginUri, options))
Pact.collectSet
);

};

const resolveBuiltInPlugin = (pluginId) => {
if (typeof pluginId !== "string" || !pluginId.endsWith("#plugin")) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pluginId would never not be a string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants